Abstract 

We establish a theory of quantum-to-classical rate distortion coding. In this setting, a 
sender Alice has many copies of a quantum information source. Her goal is to transmit classical 
information about the source, obtained by performing a measurement on it, to a receiver Bob, 
up to some specified level of distortion. We derive a single-letter formula for the minimum rate 
of classical communication needed for this task. We also evaluate this rate in the case in which 
Bob has some quantum side information about the source. Our results imply that, in general, 
Alice's best strategy is a non-classical one, in which she performs a collective measurement on 
successive outputs of the source. 

1 Introduction 

A fundamental task in quantum information theory is the reliable compression of information 
emitted by a quantum information source, to enable efficient storage of the data. Schumacher [15] 
proved that, for a memoryless source, the optimal rate of lossless data compression (in which the 
original data is recovered perfectly in the limit of asymptotically many copies of the source) is given 
by the von Neumann entropy of the source. The corresponding rate for a classical source is given 
by its Shannon entropy [18]. 

In realistic applications it may be possible, however, to tolerate imperfect recovery of the signals, 
and hence allow for a bounded distortion of the original information. In fact, this may even be 
necessitated by the lack of sufficient storage. These considerations have led to the development of 
rate distortion theory [3], which is the theory of lossy data compression. The fundamental results of 
classical rate distortion theory are attributed to Shannon [19] and date back to 1948. Its quantum 
counterpart was introduced by Barnum P] and developed further in Refs. [SIE]. Recently, Datta 
et al. identified a regularized expression for the quantum rate distortion function as well as a 
single-letter expression for the entanglement-assisted quantum rate distortion function [8]. 

In this paper, we consider the situation in which a party (say, Alice) obtains many copies of a 
quantum information source described by a quantum state, and she already has a description of 
the source in terms of its density operator. She is only allowed to perform measurements on the 
source. Her aim is to suitably compress the classical data resulting from her measurements and 
send it to another party (say, Bob) such that, upon decompression, the data recovered by Bob has 
a fixed level of distortion from the quantum source (specified by a suitable distortion observable). 
Alice is allowed to perform any measurement that she wishes on the source states to produce a 
classical sequence, with the requirement that the average symbol-wise distortion of this sequence 
be no larger than some prescribed amount. Analogous to previous terminology used in quantum 
information theory, we refer to this as quantum-to-classical rate distortion theory, since it deals 
with an analysis of the trade-off between the optimal rate of compression of the data obtained by 
measurements on the quantum source, and the allowed distortion on the recovered classical data. 
This trade-off is quantified by the quantum-to-classical rate-distortion function. 

Another way of emphasizing the relevance of quantum-to-classical rate distortion theory is by 
adopting the perspective that all classical data arises from a measurement of a quantum state. This 
is especially important in cases where the source is truly non-classical, such as an atomic decaying 
process or a highly attenuated laser.^1 In particular, we can imagine that a memoryless classical 
source arises from an appropriate measurement on the states emitted by a quantum source, and 
the resulting classical data is some description or characterization of the original quantum source. 
Thus, this perspective necessitates a revision of Shannon's rate-distortion theory [TH] by allowing 
for an arbitrary measurement to be performed on the original quantum source. A naive approach to 
this setting would be to measure each individual output of the quantum source, treat the resulting 
classical data as information emitted by a classical source, and then apply Shannon's rate-distortion 
theory to the latter. 

Here, we instead allow for collective measurements on the outputs of the source, and our approach 
is to apply a derandomized measurement compression protocol to achieve this task [24]. 
We find a single-letter formula for the quantum-to-classical rate distortion function, expressed as a 
minimization of the Holevo quantity over all maps that meet the distortion constraint. Our result 
implies that, in general, a quantum strategy is needed to achieve optimal compression rates and 
that Shannon's rate-distortion theory is insufficient in this setting. This result is analogous to 
the fact that collective measurements are needed in general in the well-known Holevo-Schumacher-Westmoreland 
theorem [12, 17] regarding classical communication over quantum channels (see 
Ref. [11] for an explicit example of a channel for which collective measurements outperform classical 
strategies). 

^1 We note that a similar perspective was used to justify the development of quantum-to-classical randomness 
extractors [4]. 

In the classical setting, the optimal rate of data compression can be reduced if the decoder (Bob) 
has some side information at his disposal. The first discovery in this direction is due to Slepian 
and Wolf [20], who showed that the optimal lossless compression rate is given by the entropy of 
the source conditioned on the side information. Wyner and Ziv extended these results to the case 
of lossy classical data compression with classical side information [26]. For the quantum setting, 
one might imagine that quantum side information is available at the decoder. In Ref. [10], Devetak 
and Winter proved that if Bob has quantum side information at his disposal, then the optimal 
lossless compression rate for a classical information source is reduced from the Shannon entropy 
of the source by the Holevo information between the source and the quantum side information. 
The case of lossy classical data compression with quantum side information, which is a quantum 
generalization of the Wyner-Ziv problem, was studied by Luo and Devetak [13]. 

We also study the effect of quantum side information on the above-mentioned quantum-to-classical 
rate distortion function. In particular, we consider the case in which some quantum side 
information about the original quantum source is available to Bob. He is allowed to use this 
information to recover the classical data obtained from Alice's measurements on the source states. 
We also let Alice and Bob share common randomness. In this case, we find a single-letter formula 
for the corresponding quantum-to-classical rate distortion function. One of our assumptions in this 
setting is that the process of compression and decompression only causes a negligible disturbance 
to the quantum side information. This assumption can be justified by the possibility of Bob 
wanting to use the quantum side information in some future protocol. Our result improves upon 
the aforementioned work of Luo and Devetak [13] in the sense that we find a matching single-letter 
converse for this setting. The achievability part of the proof of this theorem exploits measurement 
compression with quantum side information [22]. 

The paper is organized as follows. We summarize some necessary definitions and prerequisites in 
Section 2, and in Section 3 we review the concept of a distortion observable (originally introduced 
in Refs. [25, 5]). In Section 4 we introduce the task of quantum-to-classical rate distortion coding, 
define a suitable distortion observable, and derive an expression for the quantum-to-classical rate 
distortion function. In Section 5 we study quantum-to-classical rate distortion in the presence of 
quantum side information and common randomness. The main results of this paper are given by 
Theorem 3 of Section 4 and Theorems 5 and 6 of Section 5. 



2 Notations and definitions 

Let $\mathcal{B}(\mathcal{H})$ denote the algebra of linear operators acting on a finite-dimensional Hilbert space $\mathcal{H}$ and 
let $\mathcal{D}(\mathcal{H})$ denote the set of positive operators of unit trace (states) acting on $\mathcal{H}$. For any given pure 
state $|\psi\rangle \in \mathcal{H}$ we denote the projector $|\psi\rangle\langle\psi|$ simply as $\psi$. The trace distance between two operators 
$A$ and $B$ is given by $\|A - B\|_1 = \mathrm{Tr}|A - B|$, where $|C| = \sqrt{C^\dagger C}$. Throughout this paper we restrict 
our considerations to finite-dimensional Hilbert spaces, and we take the logarithm to base 2. In 
the following we denote a completely positive trace-preserving (CPTP) map $\mathcal{N} : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_B)$ 
simply as $\mathcal{N}^{A \to B}$. Similarly we denote an isometry $U : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_B \otimes \mathcal{H}_E)$ simply as $U^{A \to BE}$. 
The identity map on states in $\mathcal{D}(\mathcal{H}_A)$ is denoted as $\mathrm{id}_A$. 

The von Neumann entropy of a state $\rho \in \mathcal{D}(\mathcal{H}_A)$ is defined as $H(\rho) = -\mathrm{Tr}\{\rho \log \rho\}$. In the 
following we use $H(A|B)_\rho$ and $I(A;B)_\rho$ to respectively denote the conditional quantum entropy and 
the quantum mutual information of a bipartite state $\rho_{AB}$, and $I(A;C|B)_\sigma$ to denote the conditional 
quantum mutual information for a tripartite state $\sigma_{ABC}$ (see, e.g., Refs. [HI [21]). We also employ 
the following properties of the quantum mutual information: 

Lemma 1 (Quantum data processing inequality [16, 21]) If $\omega_{AB'} = (\mathrm{id}_A \otimes \mathcal{N}^{B \to B'})(\sigma_{AB})$, 
where $\mathcal{N}^{B \to B'}$ is a CPTP map, then 

$$I(A;B)_\sigma \ge I(A;B')_\omega. \tag{1}$$

Lemma 2 (Superadditivity of the quantum mutual information [8]) The mutual information 
is superadditive in the sense that, for any CPTP map $\mathcal{N}^{A_1A_2 \to B_1B_2}$, 

$$I(R_1R_2;B_1B_2)_\omega \ge I(R_1;B_1)_\omega + I(R_2;B_2)_\omega,$$

where 

$$\omega_{R_1R_2B_1B_2} = \mathcal{N}^{A_1A_2 \to B_1B_2}(\varphi_{R_1A_1} \otimes \varphi_{R_2A_2}),$$

and $\varphi_{R_1A_1}$ and $\varphi_{R_2A_2}$ are pure bipartite states. 

In proving our first theorem (Theorem 3 of Section 4) we make use of the "measurement compression" 
theorem (Theorem 2 of Ref. [24]). The latter specifies an optimal two-dimensional rate 
region characterizing the resources (namely, common randomness and classical communication) 
needed for an asymptotically faithful simulation of a measurement on a quantum state. For an 
exact statement of the theorem, see Refs. [24, 22]. Here we give a brief description of its content. 
Let $\psi_{RA}$ denote a purification of a quantum state $\rho \in \mathcal{D}(\mathcal{H}_A)$, multiple copies of which are in 
Alice's possession. Suppose Alice does a measurement, given by a POVM $\Lambda = \{\Lambda_x\}$, on each of 
the systems in her possession. In the ideal measurement compression protocol, the state of the 
classical registers containing Alice's measurement outcomes and the purifying reference systems $R$ 
is equivalent to many copies of the following state: 

$$\sigma_{XR} = \sum_x |x\rangle\langle x|_X \otimes \mathrm{Tr}_A\{(I_R \otimes \Lambda_x)\psi_{RA}\}. \tag{2}$$

The measurement compression theorem asserts that if Alice and Bob share $nH(X|R)_\sigma$ bits of 
common randomness, then it is possible for them to simulate the measurement $\Lambda^{\otimes n}$ on the state $\rho^{\otimes n}$ 
with approximately $nI(X;R)_\sigma$ bits of classical communication, for $n$ large enough. The simulation 
becomes faithful in the limit $n \to \infty$, in the sense that a verifying party who possesses the classical 
registers and the reference systems cannot distinguish between the output of the simulation and the 
ideal protocol. If no common randomness is present and Alice is required to obtain the outcomes 
of the measurement in addition to Bob, then the classical communication needed is equal to the 
Shannon entropy $H(X)_\sigma$. In the above, $H(X|R)_\sigma$ and $I(X;R)_\sigma$ respectively denote the conditional 
entropy and the mutual information of the state $\sigma_{XR}$ defined above. For a more detailed statement 
of the theorem, see the proof of Theorem 3 in Section 4. 

3 Distortion observables 

As discussed in the Introduction, in rate distortion theory one allows the data which is recovered 
after the compression-decompression scheme to be distorted by some finite amount from the original 
data. There are various possible choices of the distortion measure, depending on the nature of the 
application. For example, in classical rate distortion theory, the Hamming distance and the mean 
squared error are natural choices of the distortion measure [Sj |6]. In quantum rate distortion 
theory, the distortion measure is usually defined in terms of the entanglement fidelity (see, e.g., 
Refs. [Ills] and references therein). However, since the distortion is a physical quantity, it is natural 
to associate with it an observable in the quantum setting (as discussed in Section II of Ref. [5] and 
in unpublished work [25]). This is reviewed below. 

In the classical setting, let $x \in \mathcal{X}$ denote the letters of a source alphabet and let $y \in \mathcal{Y}$ denote 
the letters of a reconstruction alphabet. Then to determine the distortion between an input and 
output letter, one defines a non-negative cost function $d(x,y)$ (e.g., the Hamming distance or the 
squared error), and the average distortion is then given by 

$$\sum_x \sum_y p(x)\, q(y|x)\, d(x,y), \tag{3}$$

where $q(y|x)$ is the conditional probability of getting the letter $y$ after reconstruction when the 
source letter is $x$, and $p(x)$ is the probability of source letter $x$. 
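
The classical average distortion above can be evaluated directly for small alphabets. The following is a minimal sketch, not taken from the paper: the function names `average_distortion` and `hamming`, and the binary source with a 10% flip probability, are hypothetical choices made only for illustration.

```python
def average_distortion(p, q, d):
    """Evaluate sum_x sum_y p(x) q(y|x) d(x, y) for finite alphabets.

    p[x] is the source distribution, q[x][y] the conditional probability
    of reconstructing y given source letter x, and d(x, y) a non-negative
    cost function.
    """
    return sum(
        p[x] * q[x][y] * d(x, y)
        for x in range(len(p))
        for y in range(len(q[x]))
    )


def hamming(x, y):
    """Hamming distortion: 0 if the letters agree, 1 otherwise."""
    return 0.0 if x == y else 1.0


# Hypothetical example (an assumption, not from the text): a fair binary
# source whose letters are flipped with probability 0.1 on reconstruction.
flip = 0.1
p = [0.5, 0.5]
q = [[1 - flip, flip], [flip, 1 - flip]]
avg = average_distortion(p, q, hamming)
print(avg)
```

With the Hamming cost function, the average distortion of this toy reconstruction channel is just its flip probability.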

In the quantum case, one defines a distortion observable $\Delta$ [5]. For example, suppose that $\Delta$ is 
given by 

$$\Delta = \sum_x \sum_y d(x,y)\, |x\rangle\langle x| \otimes |y\rangle\langle y|, \tag{4}$$

where $|x\rangle$ are the Schmidt vectors of the following purification of the source state $\rho$: 

$$|\psi_{RA}\rangle = \sum_x \sqrt{\lambda_x}\, |x\rangle_R |x\rangle_A, \tag{5}$$

so that $\rho = \mathrm{Tr}_R\{\psi_{RA}\}$. 

Then we recover the expression (3) for the average distortion in the classical case as follows. 
Let $\Phi : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_B)$ denote a map on the source state. Then the average distortion is given 
by 

$$\mathrm{Tr}\{\Delta\, (\mathrm{id} \otimes \Phi)(\psi_{RA})\} = \sum_{x,y} d(x,y)\, \lambda_x\, \langle y|\Phi(|x\rangle\langle x|_A)|y\rangle. \tag{6}$$

Let us define $q(y|x) = \langle y|\Phi(|x\rangle\langle x|_A)|y\rangle$, since it can be interpreted as the conditional probability of 
the map $\Phi$ yielding the letter $y$, given that the source letter was $x$. Then setting $p(x) = \lambda_x$ (since 
$\lambda_x$, being an eigenvalue of $\rho$, is a probability), we recover the expression for the classical average 
distortion as in (3). 

4 Quantum-to-classical rate-distortion coding 

Consider a memoryless quantum information source $\{\rho, \mathcal{H}_A\}$. In quantum-to-classical (q-c) rate 
distortion, Alice starts with $n$ copies $\rho^{\otimes n}$ of the source state and performs a POVM $\Lambda^{(n)} = \{\Lambda_{x^n}\}$ 
on it, with the POVM elements $\Lambda_{x^n} \in \mathcal{B}(\mathcal{H}_A^{\otimes n})$ being indexed by classical sequences $x^n \in \mathcal{X}^n$ ($\mathcal{X}$ 
being a finite alphabet), which correspond to the different possible outcomes of the measurement. 

Figure 1: The most general protocol for quantum-to-classical rate-distortion coding. Alice has many 
copies of the quantum information source, on which she performs a collective measurement with 
classical output $L$. She sends the variable $L$ over noiseless classical bit channels to Bob. Bob then 
performs a classical decoding map on $L$ that outputs the classical sequence $X^n$. The average 
deviation of this sequence from the quantum source, according to some distortion observable, 
provides a measure of the distortion caused by this protocol. 



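It is convenient to define a measurement map $\mathcal{M}_{\Lambda^{(n)}}$ corresponding to the POVM $\Lambda^{(n)}$ as follows: 
For any state $\sigma_n$, 

$$\mathcal{M}_{\Lambda^{(n)}}(\sigma_n) = \sum_{x^n} \mathrm{Tr}\{\Lambda_{x^n}\sigma_n\}\, |x^n\rangle\langle x^n|. \tag{7}$$

The above specifies that with probability $\mathrm{Tr}\{\Lambda_{x^n}\sigma_n\}$ the outcome of the POVM $\Lambda^{(n)}$ on the state 
$\sigma_n$ is given by the classical sequence $x^n$. Figure 1 depicts the most general protocol for 
quantum-to-classical rate-distortion coding. 

If $\psi_{R^nA^n}$ denotes a purification of $\rho^{\otimes n}$, then the following bipartite state characterizes both 
the classical outcome of the POVM $\Lambda^{(n)}$ on $\rho^{\otimes n}$ and the post-measurement state of the purifying 
reference system: 

$$\sigma_{R^nX^n} = \sum_{x^n} \mathrm{Tr}_{A^n}\{(I_{R^n} \otimes \Lambda_{x^n})\psi_{R^nA^n}\} \otimes |x^n\rangle\langle x^n|_{X^n}. \tag{8}$$

We define the q-c distortion measure for a state $\rho \in \mathcal{D}(\mathcal{H}_A)$ with purification $|\psi_{RA}\rangle$ and a 
POVM $\Lambda = \{\Lambda_x\}$ as 

$$d(\rho, \mathcal{M}_\Lambda) \equiv \mathrm{Tr}\left(\Delta\, (\mathrm{id} \otimes \mathcal{M}_\Lambda)(\psi_{RA})\right), \tag{9}$$

where $\mathcal{M}_\Lambda$ is the measurement map corresponding to $\Lambda$, and $\Delta$ is a q-c distortion observable given 
by 

$$\Delta = \Delta_{RX} \equiv \sum_x \Delta_x \otimes |x\rangle\langle x|, \tag{10}$$

with $\Delta_x \ge 0$. 

A q-c rate distortion POVM of rate $R$ is given by a POVM $\Lambda^{(n)}$ with $\lfloor 2^{nR} \rfloor$ outcomes, i.e., 
$\Lambda^{(n)} = \{\Lambda_{x^n}\}$ with 

$$\#\{x^n \in \mathcal{X}^n : \Lambda_{x^n} \ne 0\} = \lfloor 2^{nR} \rfloor. \tag{11}$$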
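
To define the average distortion resulting from this POVM, we consider a symbol-wise q-c 
distortion observable 

$$\Delta^{(n)} \equiv \frac{1}{n} \sum_{i=1}^{n} \Delta_{R_iX_i} \otimes I_{RX}^{[n]\setminus i}, \tag{12}$$

where each operator $\Delta_{R_iX_i}$ is of the form (10) and $I_{RX}^{[n]\setminus i}$ denotes the identity operator acting on all 
but the $i$-th member of the tensor product of Hilbert spaces $(\mathcal{H}_R \otimes \mathcal{H}_X)^{\otimes n}$. The average distortion 
is then defined as 

$$d(\rho, \mathcal{M}_{\Lambda^{(n)}}) \equiv \mathrm{Tr}\left(\Delta^{(n)} (\mathrm{id}_{R^n} \otimes \mathcal{M}_{\Lambda^{(n)}})\psi_{R^nA^n}\right) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{Tr}\left(\Delta_{R_iX_i}\, \sigma_{R_iX_i}\right),$$

where $\sigma_{R_iX_i}$ is the marginal of $\sigma_{R^nX^n}$ on the $i$-th pair of systems, with $\sigma_{R^nX^n} = (\mathrm{id}_{R^n} \otimes \mathcal{M}_{\Lambda^{(n)}})\psi_{R^nA^n}$. 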

For any $R, D \ge 0$, the pair $(R, D)$ is said to be an achievable q-c rate distortion pair if there 
exists a sequence of POVMs $\{\Lambda^{(n)}\}_{n \ge 1}$ of rate $R$ such that 

$$\lim_{n \to \infty} d(\rho, \mathcal{M}_{\Lambda^{(n)}}) \le D. \tag{13}$$

The q-c rate distortion function is then defined as 

$$R^{qc}(D) = \inf\{R : (R, D) \text{ achievable}\}. \tag{14}$$

The following theorem provides a single-letter expression for $R^{qc}(D)$. 

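Theorem 3 For a memoryless quantum information source $\{\rho, \mathcal{H}_A\}$, a quantum-to-classical 
distortion observable $\Delta_{RX}$, and any given distortion $D \ge 0$, the quantum-to-classical rate distortion 
function is given by 

$$R^{qc}(D) = \min_{\text{POVM } \Lambda = \{\Lambda_x\}\, :\, d(\rho, \mathcal{M}_\Lambda) \le D} I(X;R)_\sigma, \tag{15}$$

where $d(\rho, \mathcal{M}_\Lambda)$ is defined through (9)-(10) and 

$$\sigma_{RX} = (\mathrm{id}_R \otimes \mathcal{M}_\Lambda)(\psi_{RA}) = \sum_x \mathrm{Tr}_A\{(I_R \otimes \Lambda_x)\psi_{RA}\} \otimes |x\rangle\langle x|_X. \tag{16}$$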
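Proof. We first give the proof of achievability, which follows directly from the measurement 
compression theorem [23] (summarized briefly in the previous section). Our approach is similar to one 
used before [23]: exploit a channel simulation protocol and derandomize the common randomness 
consumed by this protocol. So, fix the POVM $\Lambda = \{\Lambda_x\}$ that minimizes the RHS of (15). Thus we 
have 

$$d(\rho, \mathcal{M}_\Lambda) = \mathrm{Tr}\left[\Delta\, (\mathrm{id}_R \otimes \mathcal{M}_\Lambda)(\psi_{RA})\right] \le D. \tag{17}$$

In Ref. [24], it was proved that there exists a finite set of POVMs $\{\Lambda^{(m)} = \{\Lambda^{(m)}_{x^n}\} : m = 1, 2, \ldots, M\}$, 
each having at most $L$ outcomes, i.e., $\#\{x^n \in \mathcal{X}^n : \Lambda^{(m)}_{x^n} \ne 0\} \le L$, with 

$$L \le 2^{nI(X;R)_\sigma + O(\sqrt{n})}, \tag{18}$$

such that for any $\epsilon > 0$ and $n$ large enough, the POVM $\Lambda^{(n)} = \{\Lambda_{x^n}\}$ defined as 

$$\Lambda_{x^n} = \frac{1}{M} \sum_{m=1}^{M} \Lambda^{(m)}_{x^n} \tag{19}$$

satisfies the following condition: 

$$\left\| (\mathrm{id}_{R^n} \otimes \mathcal{M}_{\Lambda^{(n)}})\psi_{R^nA^n} - (\mathrm{id}_{R^n} \otimes \mathcal{M}_{\Lambda^{\otimes n}})\psi_{R^nA^n} \right\|_1 \le \epsilon, \tag{20}$$

where, for any sequence $x^n = x_1 \ldots x_n \in \mathcal{X}^n$, the corresponding element of $\Lambda^{\otimes n}$ is $\Lambda_{x_1} \otimes \cdots \otimes \Lambda_{x_n}$. Further, due to our 
choice (12) of a symbol-wise q-c distortion observable $\Delta^{(n)}$, we have that 

$$d(\rho, \mathcal{M}_{\Lambda^{\otimes n}}) = \mathrm{Tr}\left[\Delta^{(n)} (\mathrm{id}_{R^n} \otimes \mathcal{M}_{\Lambda^{\otimes n}})\psi_{R^nA^n}\right] = d(\rho, \mathcal{M}_\Lambda). \tag{21}$$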
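From (20), we know that the protocol for simulating the tensor-product measurement has measurement 
encodings $\{\Lambda^{(m)}_l\}$. Let $\mathcal{D}^{(m)}(l)$ denote the corresponding classical decodings which construct 
the sequences $x^n$ from the values of $l$ and $m$, where $l$ is the measurement outcome and $m$ is the 
common randomness. Then 

$$\left\| (\mathrm{id}_{R^n} \otimes \mathcal{M}_{\Lambda^{\otimes n}})(\psi_{R^nA^n}) - \frac{1}{M} \sum_{m,l} \mathrm{Tr}_{A^n}\{(\mathrm{id}_{R^n} \otimes \Lambda^{(m)}_l)\psi_{R^nA^n}\} \otimes |\mathcal{D}^{(m)}(l)\rangle\langle\mathcal{D}^{(m)}(l)| \right\|_1 \le \epsilon.$$

Then using (17) we obtain a bound on the average distortion resulting from the action of the POVM 
$\Lambda^{(n)}$ on the source state $\rho^{\otimes n}$ as follows: 

$$\begin{aligned}
d(\rho, \mathcal{M}_{\Lambda^{(n)}}) 
&= \mathrm{Tr}\left\{\Delta^{(n)}\, \frac{1}{M} \sum_{m,l} \mathrm{Tr}_{A^n}\{(\mathrm{id}_{R^n} \otimes \Lambda^{(m)}_l)\psi_{R^nA^n}\} \otimes |\mathcal{D}^{(m)}(l)\rangle\langle\mathcal{D}^{(m)}(l)|\right\} \\
&= \frac{1}{M} \sum_{m} \mathrm{Tr}\left\{\Delta^{(n)} \sum_{l} \mathrm{Tr}_{A^n}\{(\mathrm{id}_{R^n} \otimes \Lambda^{(m)}_l)\psi_{R^nA^n}\} \otimes |\mathcal{D}^{(m)}(l)\rangle\langle\mathcal{D}^{(m)}(l)|\right\} \\
&\le \mathrm{Tr}\left\{\Delta^{(n)} (\mathrm{id}_{R^n} \otimes \mathcal{M}_{\Lambda^{\otimes n}})(\psi_{R^nA^n})\right\} + \epsilon\, d_{\max} \\
&\le D + \epsilon\, d_{\max},
\end{aligned}$$

where $d_{\max}$ is the maximum eigenvalue of $\Delta^{(n)}$. Also, in the above, we see how it is possible to 
derandomize the common randomness: there exists a choice of the $m$ such that 

$$\mathrm{Tr}\left\{\Delta^{(n)} \sum_{l} \mathrm{Tr}_{A^n}\{(\mathrm{id}_{R^n} \otimes \Lambda^{(m)}_l)\psi_{R^nA^n}\} \otimes |\mathcal{D}^{(m)}(l)\rangle\langle\mathcal{D}^{(m)}(l)|\right\} \le D + \epsilon\, d_{\max}. \tag{22}$$

Hence, 

$$\lim_{n \to \infty} d(\rho, \mathcal{D}_n \circ \mathcal{M}_{\Lambda^{(n)}}) \le D.$$

Thus, a measurement compression protocol directly yields a q-c rate distortion protocol. 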

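Now we give a proof for the converse. Let $\Lambda^{(n)} : A^n \to L$ be a POVM with $\Lambda^{(n)} = \{\Lambda^{(n)}_l\}$, and 
let $\mathcal{D}_n : L \to X^n$ be a decoding map (with $L$ and $X^n$ denoting classical systems) such that 

$$\lim_{n \to \infty} d(\rho, \mathcal{D}_n \circ \mathcal{M}_{\Lambda^{(n)}}) \le D, \tag{23}$$

where $\mathcal{M}_{\Lambda^{(n)}}$ is the measurement map corresponding to the POVM $\Lambda^{(n)}$. Defining 
$\sigma_{R^nL} \equiv (\mathrm{id}_{R^n} \otimes \mathcal{M}_{\Lambda^{(n)}})\psi_{R^nA^n}$, we have $\sigma_L = \sum_l \mathrm{Tr}(\Lambda^{(n)}_l \rho^{\otimes n})\, |l\rangle\langle l|$ and 

$$\begin{aligned}
nR &\ge H(L)_\sigma \\
&\ge I(L;R^n)_\sigma \\
&\ge I(X^n;R^n)_\omega.
\end{aligned} \tag{24}$$

The first inequality holds because the entropy $H(L)_\sigma$ is upper bounded by the entropy $nR$ of the 
uniform distribution. In the second line, the inequality follows because $I(L;R^n)_\sigma = H(L)_\sigma - H(L|R^n)_\sigma$ 
and $H(L|R^n)_\sigma \ge 0$ since $L$ is classical. In the third line, $\omega_{X^nR^n} \equiv (\mathrm{id}_{R^n} \otimes \mathcal{D}_n)(\sigma_{R^nL})$. 
This inequality follows from the quantum data processing inequality (Lemma 1). Continuing, we 
have 

$$\begin{aligned}
\text{RHS of (24)} &\ge \sum_{i=1}^{n} I(X_i;R_i)_\omega \\
&\ge \sum_{i=1}^{n} R^{qc}\left(d(\rho, \mathcal{F}_n^{(i)})\right) \\
&= n \sum_{i=1}^{n} \frac{1}{n} R^{qc}\left(d(\rho, \mathcal{F}_n^{(i)})\right) \\
&\ge n R^{qc}\left(\sum_{i=1}^{n} \frac{1}{n} d(\rho, \mathcal{F}_n^{(i)})\right) \\
&\ge n R^{qc}(D),
\end{aligned} \tag{25}$$

for $n$ sufficiently large. In the above, $\mathcal{F}_n^{(i)}$ is the marginal operation on the $i$-th copy of the 
source space induced by the overall operation $\mathcal{D}_n \circ \mathcal{M}_{\Lambda^{(n)}}$. The first inequality follows from the 
superadditivity of the quantum mutual information (Lemma 2). The second inequality follows from 
the fact that the map $\mathcal{F}_n^{(i)}$ has distortion $d(\rho, \mathcal{F}_n^{(i)})$, so that, by definition (14), the corresponding 
mutual information is lower bounded by the q-c rate distortion function evaluated at this distortion. 
The last two inequalities follow from the convexity of the q-c rate distortion function, from the 
assumption that the average distortion of the protocol is less than or equal to $D$ for $n$ large enough, i.e., 

$$\sum_{i=1}^{n} \frac{1}{n} d(\rho, \mathcal{F}_n^{(i)}) \le D, \quad \text{for } n \text{ sufficiently large},$$

and the fact that $R^{qc}(D)$ is a non-increasing function of $D$. 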



A natural choice for each $\Delta_x$ in the distortion observable in (10) is 

$$\Delta_x = I - |x\rangle\langle x|. \tag{26}$$

For such a choice, the distortion of the classical data, resulting from the measurement, is measured 
with respect to the classical data that would result from an ideal measurement of the source state 
$\rho$ in its eigenbasis. However, such a choice is effectively classical because the operators $\Delta_x$ are 
diagonal in the Schmidt basis of $\psi_{RA}$. We show in Lemma 4 below that, for such a choice of 
the distortion observable, the best strategy for rate-distortion coding amounts to an effectively 
classical strategy, in which Alice measures each output of the source state in its eigenbasis, thus 
obtaining a classical sequence, which she then compresses by applying the purely classical protocol 
for Shannon's rate-distortion coding. Thus, a necessary condition for there to be a quantum 
advantage in quantum-to-classical rate distortion coding is that the operators $\Delta_x$ should not be 
diagonal in the Schmidt basis of $\psi_{RA}$. After Lemma 4 we provide an example of a quantum source 
and a distortion observable for which quantum-to-classical rate distortion coding gives an advantage 
over the above classical strategy. 



Lemma 4 If each operator $\Delta_x$, in the definition (10) of the distortion observable, is diagonal 
in the Schmidt basis of $\psi_{RA}$, then a quantum-to-classical rate distortion coding scheme has no 
advantage over a classical scheme, in the following sense: the optimal measurement map is a von 
Neumann measurement in the eigenbasis of the source state, followed by classical post-processing of 
the measurement result according to Shannon's rate distortion theory. 



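Proof. Let $\Lambda$ denote the minimal POVM in (15) for a given distortion $D$, and let $\mathcal{M}_\Lambda$ denote the 
corresponding measurement map. Let the Schmidt decomposition of the purification $\psi_{RA}$ of the 
source state $\rho \in \mathcal{D}(\mathcal{H}_A)$ be as follows: 

$$|\psi_{RA}\rangle = \sum_z \sqrt{p(z)}\, |z\rangle_R |z\rangle_A,$$

where $\sqrt{p(z)}$ are the Schmidt coefficients. Then the distortion that the map $\mathcal{M}_\Lambda$ causes is as follows: 

$$\begin{aligned}
&\mathrm{Tr}\{\Delta_{RX} (\mathrm{id}_R \otimes \mathcal{M}_\Lambda)(\psi_{RA})\} \\
&\quad = \mathrm{Tr}\left\{\left(\sum_x \Delta_x \otimes |x\rangle\langle x|_X\right) \left(\sum_{z,z',y} \sqrt{p(z)p(z')}\, |z\rangle\langle z'|_R \otimes \mathrm{Tr}\{\Lambda_y |z\rangle\langle z'|_A\}\, |y\rangle\langle y|_X\right)\right\} \\
&\quad = \sum_{x,z,z'} \sqrt{p(z)p(z')}\, \langle z'|\Delta_x|z\rangle_R\, \langle z'|\Lambda_x|z\rangle_A,
\end{aligned}$$

which is equivalent to 

$$\langle\psi_{RA}| \left(\sum_x \Delta_x \otimes \Lambda_x\right) |\psi_{RA}\rangle. \tag{27}$$

Now suppose that each $\Delta_x$ is diagonal in the Schmidt basis of the reference system, so that 

$$\Delta_x = \sum_z \langle z|\Delta_x|z\rangle\, |z\rangle\langle z|.$$

Then the above expression for the distortion reduces to the following one: 

$$\sum_{x,z} p(z)\, \langle z|\Delta_x|z\rangle_R\, \langle z|\Lambda_x|z\rangle_A. \tag{28}$$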

Consider $\mathcal{H}_R \cong \mathcal{H}_A$ and choose $\{|z\rangle_R\}$ and $\{|z\rangle_A\}$ to be identical bases, which we simply denote 
as $\{|z\rangle\}$. Then (28) implies that, starting from the original POVM $\Lambda$, we can construct another 
POVM $\Lambda'$ (say) which is diagonal in the eigenbasis of $\rho$, and which results in a distortion equal to 
that caused by the original POVM. The POVM $\Lambda'$ is given by 

$$\Lambda' := \{\Lambda'_x\}, \quad \text{where } \Lambda'_x := \sum_z \langle z|\Lambda_x|z\rangle\, |z\rangle\langle z|.$$

Clearly, the following identity holds: 

$$\sum_{x,z} p(z)\, \langle z|\Delta_x|z\rangle\, \langle z|\Lambda_x|z\rangle = \sum_{x,z} p(z)\, \langle z|\Delta_x|z\rangle\, \langle z|\Lambda'_x|z\rangle.$$

The joint states of the reference system and the post-measurement classical register, resulting from 
the POVMs $\Lambda$ and $\Lambda'$, are respectively given as follows: 

$$\sigma_{RX} = (\mathrm{id}_R \otimes \mathcal{M}_\Lambda)(\psi_{RA}), \quad \text{and} \quad \sigma'_{RX} = (\mathrm{id}_R \otimes \mathcal{M}_{\Lambda'})(\psi_{RA}), \tag{29}$$

where $\mathcal{M}_{\Lambda'}$ is the measurement map corresponding to the POVM $\Lambda'$. It turns out that the mutual 
information $I(X;R)_{\sigma'}$ can only be smaller than $I(X;R)_\sigma$. This can be seen as follows. Note that 
we can equivalently write the state $\sigma_{RX}$ as 

$$\sigma_{RX} = \sum_x \left(\sqrt{\rho}\, \Lambda_x^{*} \sqrt{\rho}\right)_R \otimes |x\rangle\langle x|_X, \tag{30}$$

with $^{*}$ denoting complex conjugation in the basis $\{|z\rangle\}$. Then the state $\sigma'_{RX}$ can be written as 

$$\sigma'_{RX} = \sum_x \left(\sqrt{\rho} \left(\sum_z \langle z|\Lambda_x|z\rangle\, |z\rangle\langle z|\right) \sqrt{\rho}\right)_R \otimes |x\rangle\langle x|_X.$$

Since $\{|z\rangle\}$ is the eigenbasis of $\rho$, $[\sqrt{\rho}, |z\rangle\langle z|] = 0$, and hence the above state is equivalent to the 
following one: 

$$\sum_{x,z} \langle z|\left(\sqrt{\rho}\, \Lambda_x \sqrt{\rho}\right)|z\rangle\, |z\rangle\langle z|_R \otimes |x\rangle\langle x|_X,$$

which is a classical-classical state. Note that such a state is equivalent to the state which would 
result from the action of a completely dephasing channel on the reference system $R$ of the state 
$\sigma_{RX}$ given by (30), i.e., $\sigma'_{RX} = (\mathcal{N} \otimes \mathrm{id})\sigma_{RX}$, where $\mathcal{N}$ denotes a completely dephasing channel. 
The mutual information can only decrease under such a map and hence $I(X;R)_{\sigma'} \le I(X;R)_\sigma$. 
This implies that in this case the optimal measurement to perform on the source is a von Neumann 
measurement in the eigenbasis of $\rho$, followed by classical post-processing according to the conditional 
distribution given by $p(x|z) = \langle z|\Lambda_x|z\rangle$ (that this is a distribution follows from the fact that 
$\sum_x \Lambda_x = I$). Thus, this is equivalent to what one would obtain by exploiting Shannon's rate 
distortion theorem in a straightforward way. ■ 

The following example illustrates a scenario in which quantum-to-classical rate distortion coding 
gives an advantage over a purely classical strategy. 

Figure 2: A plot of compression rate vs. distortion for the quantum information source $\rho$ given 
by (31), and the rate distortion observable given by (32). It was obtained by randomly sampling 
250,000 two-outcome POVMs, and (for those POVMs which satisfy the distortion criterion $D \le 1/4$) 
plotting the mutual information $I(X;R)_\sigma$ for the resulting state $\sigma_{RX}$ (defined by (16)) against 
the corresponding value of the distortion. The boundary of the shaded region defines the 
rate-distortion trade-off curve. 



Example: Consider a quantum information source which generates the states $|+\rangle$ and $|0\rangle$ with 
equal probability 1/2, so that the density operator for the source is 

$$\rho = \tfrac{1}{2}\left(|+\rangle\langle +| + |0\rangle\langle 0|\right) = \cos^2(\pi/8)\, |\phi_0\rangle\langle\phi_0| + \sin^2(\pi/8)\, |\phi_1\rangle\langle\phi_1|, \tag{31}$$

where 

$$|\phi_0\rangle = \cos(\pi/8)\, |0\rangle + \sin(\pi/8)\, |1\rangle, \qquad |\phi_1\rangle = \sin(\pi/8)\, |0\rangle - \cos(\pi/8)\, |1\rangle.$$

A purification of the source state $\rho$ is given by 

$$|\psi_{RA}\rangle = \cos(\pi/8)\, |\phi_0\rangle_R |\phi_0\rangle_A + \sin(\pi/8)\, |\phi_1\rangle_R |\phi_1\rangle_A.$$

Suppose we are interested in measuring the distortion of the classical data (obtained as a result 
of a quantum-to-classical rate distortion task), by how much it deviates from the quantum states 
that specify the ensemble of the quantum information source. In this case, we would choose our 
distortion observable to be as follows: 

$$\Delta_{RX} = \left(I - |+\rangle\langle +|\right)_R \otimes |0\rangle\langle 0|_X + \left(I - |0\rangle\langle 0|\right)_R \otimes |1\rangle\langle 1|_X. \tag{32}$$

Note that if we consider a two-outcome POVM $\Lambda = \{\Lambda_0, \Lambda_1\}$, where $\Lambda_0 = I/2 = \Lambda_1$, then the 
state $\sigma_{RX}$ defined by (16) is given by 

$$\sigma_{RX} = \rho_R \otimes \frac{I_X}{2},$$

where 

$$\rho_R = \mathrm{Tr}_A\{\psi_{RA}\} = \cos^2(\pi/8)\, |\phi_0\rangle\langle\phi_0|_R + \sin^2(\pi/8)\, |\phi_1\rangle\langle\phi_1|_R.$$



In this case, the choice (32) of the distortion observable yields the following value of the distortion: 

$$D = \mathrm{Tr}\{\Delta_{RX} (\mathrm{id}_R \otimes \mathcal{M}_\Lambda)(\psi_{RA})\} = 1/4.$$

Moreover, since the state $\sigma_{RX}$ is uncorrelated, we have that $I(X;R)_\sigma = 0$, and hence, by Theorem 3, 
the rate distortion function $R^{qc}(D)$ is equal to zero at $D = 1/4$ (and hence for all larger $D$, since it 
is non-increasing). This implies that to obtain the full rate-distortion trade-off curve, one only needs 
to consider values of the distortion $D$ in the range $0 \le D \le 1/4$. 
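
The value $D = 1/4$ for the trivial POVM can be checked with a few lines of arithmetic. The sketch below is an illustration only, assuming real matrices in the computational basis and the form $\sqrt{\rho}\,\Lambda_x^{T}\sqrt{\rho}$ for the unnormalized reference states; the helper names are hypothetical, and the second, rotated projective measurement is a choice made here for comparison rather than one discussed in the text.

```python
import math

C, S = math.cos(math.pi / 8), math.sin(math.pi / 8)
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def outer(u, v):
    """Rank-one matrix |u><v| for real 2-vectors."""
    return [[u[i] * v[j] for j in range(2)] for i in range(2)]


def combine(a, b, wa=1.0, wb=1.0):
    """Return wa * a + wb * b for 2x2 matrices."""
    return [[wa * a[i][j] + wb * b[i][j] for j in range(2)] for i in range(2)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


# Eigenvectors of rho from (31) and the square root of rho.
phi0, phi1 = (C, S), (S, -C)
sqrt_rho = combine(outer(phi0, phi0), outer(phi1, phi1), C, S)

# Distortion observable (32): Delta_0 = I - |+><+|, Delta_1 = I - |0><0|.
plus = (1 / math.sqrt(2), 1 / math.sqrt(2))
zero = (1.0, 0.0)
delta = [combine(IDENTITY, outer(plus, plus), 1.0, -1.0),
         combine(IDENTITY, outer(zero, zero), 1.0, -1.0)]


def distortion(povm):
    """Sum over x of Tr[Delta_x sqrt(rho) Lambda_x^T sqrt(rho)]."""
    total = 0.0
    for delta_x, lam_x in zip(delta, povm):
        sigma_x = mat_mul(mat_mul(sqrt_rho, transpose(lam_x)), sqrt_rho)
        total += trace(mat_mul(delta_x, sigma_x))
    return total


# Trivial POVM with both elements equal to I/2.
half = combine(IDENTITY, IDENTITY, 0.5, 0.0)
d_trivial = distortion([half, half])

# Illustrative projective measurement (an assumption, not from the text)
# along directions at angles 3*pi/8 and -pi/8 in the real plane.
w0, w1 = (S, C), (C, -S)
d_rotated = distortion([outer(w0, w0), outer(w1, w1)])
print(d_trivial, d_rotated)
```

Under these assumptions the trivial POVM reproduces $D = 1/4$, while the rotated measurement steers the reference onto the ensemble states and its distortion vanishes up to rounding, consistent with the trade-off curve covering the whole range $0 \le D \le 1/4$.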

The rate-distortion trade-off curve, for the above range of values of $D$, was obtained numerically 
for the rate distortion observable defined by (32), and is given by the boundary of the shaded region 
in Fig. 2. As expected, the curve decreases monotonically with $D$. 

To prove that in this case a quantum-to-classical rate distortion coding gives an advantage over 
a purely classical strategy, consider a two-outcome POVM $\Lambda$ which corresponds to a von Neumann 
measurement in the eigenbasis of the source state $\rho$, i.e., $\Lambda = \{\Lambda_0, \Lambda_1\}$, where 

$$\Lambda_0 = |\phi_0\rangle\langle\phi_0| \quad \text{and} \quad \Lambda_1 = |\phi_1\rangle\langle\phi_1|,$$

or, more generally, consider any $\Lambda = \{\Lambda_0, \Lambda_1\}$ such that $\mathcal{N}(\Lambda_i) = \Lambda_i$, for $i = 0, 1$, where $\mathcal{N}$ denotes 
a dephasing channel, with the dephasing being in the eigenbasis of $\rho$. In this case one finds that, 
if the distortion observable is chosen as in (32), the distortion is always equal to the maximum 
allowed value $D = 1/4$. This implies that for distortion in the range $0 \le D < 1/4$, for the choice 
(32), quantum-to-classical rate-distortion coding gives an advantage over a classical strategy.^2 
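
The claim that every POVM diagonal in the eigenbasis of $\rho$ gives distortion $1/4$ can be spot-checked numerically. The following sketch is illustrative and rests on one assumption spelled out in the comments: for such a POVM the distortion reduces to $\sum_{x,z} p(z)\,\langle\phi_z|\Lambda_x|\phi_z\rangle\,\langle\phi_z|\Delta_x|\phi_z\rangle$. The names `delta_diag` and `dephased_distortion`, and the grid of weights, are hypothetical.

```python
import math

C, S = math.cos(math.pi / 8), math.sin(math.pi / 8)

phi = [(C, S), (S, -C)]          # eigenvectors of rho from (31)
p = [C * C, S * S]               # corresponding eigenvalues
# Delta_x = I - |t_x><t_x| with t_0 = |+> and t_1 = |0>, as in (32).
targets = [(1 / math.sqrt(2), 1 / math.sqrt(2)), (1.0, 0.0)]


def delta_diag(x, z):
    """Diagonal element <phi_z| Delta_x |phi_z>."""
    overlap = phi[z][0] * targets[x][0] + phi[z][1] * targets[x][1]
    return 1.0 - overlap ** 2


def dephased_distortion(a, b):
    """Distortion for Lambda_0 = a|phi_0><phi_0| + b|phi_1><phi_1|.

    Lambda_1 = I - Lambda_0. Since both elements are diagonal in the
    eigenbasis of rho, only diagonal elements of Delta_x contribute.
    """
    weights = [(a, b), (1 - a, 1 - b)]   # weights[x][z] = <phi_z|Lambda_x|phi_z>
    return sum(p[z] * weights[x][z] * delta_diag(x, z)
               for x in range(2) for z in range(2))


# Scan a small grid of admissible weights 0 <= a, b <= 1.
grid = [i / 4 for i in range(5)]
values = [dephased_distortion(a, b) for a in grid for b in grid]
print(min(values), max(values))
```

Every value on the grid agrees with $1/4$ up to rounding, in line with the statement that a strategy dephased in the eigenbasis of $\rho$ cannot reach any distortion below the maximum allowed value for this observable.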



5 Quantum-to-classical rate-distortion coding with quantum side 
information 

We now consider a class of protocols in which Alice and Bob share many copies of some quantum 
state pAB- This state can be considered to arise from the action of an isometry on the state of 
a memoryless quantum information source performed by a third party (say, Charlie), who then 
distributes the systems A and B to Alice and Bob, respectively. The system B acts as Bob's 
quantum side information. We also let Alice and Bob share common randomness. The goal is to 
quantify the minimum rate at which Alice needs to send classical data to Bob, such that he can 
reconstruct a classical approximation of the state pA = Tr^ {pab} by using the received classical 
data and his quantum side information. By a "classical approximation," we mean that for a fixed 
distortion D > 0, where the distortion is defined as 

d(p,A^A) = Tr{Ai?xB(idH®A^A®idB)(T^^^^)}, (33) 

and a chosen distortion observable A of the following form: 

Arbx = Y.^Rb(^\^){Ax^ (34) 

X 

"classical strategy" here corresponds to a measurement in the eigenbasis of p, followed by classical post- 
processing. 
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Reference 

R" 




Figure 3: The most general protocol for quantum-to-classical rate-distortion coding with quantum 
side information. Alice and Bob share many copies of a quantum state pab, which is purified by 
an inaccessible reference system. We also allow them access to common randomness M before 
the protocol begins. Alice first performs a collective measurement on her systems, producing a 
classical output L. She then transmits L over noiseless classical bit channels to Bob. Bob performs 
a collective measurement on his quantum systems, depending on what he receives from Alice and 
his share of the common randomness. This measurement produces a classical sequence X" and 
has quantum outputs as well. The protocol is deemed successful if the classical sequence X" is 
not distorted on average from the quantum source more than a specified amount according to a 
suitable distortion observable. We also demand that the disturbance caused by the protocol to the 
joint state of the reference and Bob's systems is asymptotically negligible. This in turn implies that 
quantum side information suffers a negligible disturbance and hence is available to Bob for future 
use. 



we require that (13) is satisfied. The rate distortion function in this scenario is defined in a manner
analogous to $R^{q-c}(D)$ of the previous section, and is denoted as $R^{q-c}_{\mathrm{qsi}}(D)$. In the above, $\psi^{\rho}_{RAB}$ is a
purification of the state $\rho_{AB}$, and since we are interested in measuring the distortion that occurs
on the $A$ system only, the operators $\Delta^x_{RB}$ in (34) should act on all systems that purify the $A$
system. Figure 3 depicts the most general protocol for quantum-to-classical rate-distortion coding
with quantum side information.

Ref. [22] contains a theorem that determines the optimal rates for measurement compression
in the presence of quantum side information. It almost immediately leads to the following rate 
distortion theorem: 

Theorem 5 For a memoryless quantum information source characterized by a state $\rho_{AB}$ (where
Alice possesses $A$ and Bob possesses $B$), a quantum-to-classical distortion observable $\Delta_{RBX}$, and
any given distortion $D > 0$, an achievable rate for quantum-to-classical rate distortion with quantum
side information, when sufficient common randomness is available, is given by

$$\min_{\Lambda \,:\, d(\rho, \mathcal{M}_\Lambda) \le D} I(X;R|B)_\sigma, \qquad (35)$$

so that

$$R^{q-c}_{\mathrm{qsi}}(D) \le \min_{\Lambda \,:\, d(\rho, \mathcal{M}_\Lambda) \le D} I(X;R|B)_\sigma, \qquad (36)$$

where $\Lambda = \{\Lambda_x\}$ is a POVM acting only on Alice's system, $d(\rho, \mathcal{M}_\Lambda)$ is defined through (33)-(34),
and $\psi^{\rho}_{RAB}$ is a purification of the state $\rho_{AB}$. The state $\sigma$ is the following classical-quantum state:

$$\sigma_{XRB} = \sum_x |x\rangle\langle x|_X \otimes \mathrm{Tr}_A\{(I_R \otimes \Lambda_x \otimes I_B)\,\psi^{\rho}_{RAB}\}. \qquad (37)$$

Proof. The proof of the achievability part of this theorem proceeds similarly to that of Theorem 3.
We merely fix the POVM that minimizes the RHS of (35). From this POVM, we can construct a
protocol for measurement compression with quantum side information by invoking Theorem 12 of
Ref. [22]. This protocol exploits classical communication at a rate $I(X;R|B)_\sigma$ and common randomness
at a rate $H(X|RB)_\sigma$ in order to simulate the action of the POVM on many copies of the state $\rho_{AB}$.
By an argument similar to that in the proof of the achievability part of Theorem 3, we know that
such a protocol meets the distortion criterion and that it is possible to derandomize the common
randomness in the same way as in (22). ■
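
As a quick sanity check on the resource count in the proof above (our own remark, relying only on the definition of conditional mutual information and on the notation of (37)), the classical communication rate and the common randomness rate add up to the conditional entropy of the measurement outcome given Bob's system:

```latex
\begin{align*}
I(X;R|B)_\sigma + H(X|RB)_\sigma
  &= \bigl[ H(X|B)_\sigma - H(X|RB)_\sigma \bigr] + H(X|RB)_\sigma \\
  &= H(X|B)_\sigma .
\end{align*}
```

Under this reading, the two resources together supply a total rate of $H(X|B)_\sigma$, and only the part $I(X;R|B)_\sigma$ has to be sent as classical communication.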

If, in addition, we demand that the protocol causes asymptotically negligible disturbance of
the state of Bob (i.e., the quantum side information) and the state of the reference system, then
we can prove that the upper bound in (36) is achieved. Hence, in this case, the rate distortion
function is given by a single-letter formula. The requirement of the
protocol leaving the states of Bob and the reference essentially undisturbed might seem somewhat
restrictive at first. However, it can be justified as follows. Firstly, note that ignoring the quantum
side information leads to a protocol with a classical communication rate of $I(X;RB)$, which of
course does not disturb the systems of the reference and Bob in any way. Secondly, Bob might wish 
to use the quantum side information in some future information-processing task, which therefore 
leads to the above requirement on the state of his system. In light of this, it seems reasonable to 
restrict consideration to a class of protocols in which Bob is allowed to exploit the quantum side 
information, but only in a way which causes negligible disturbance to it. These considerations yield 
the following theorem: 

Theorem 6 For a memoryless quantum information source $\rho_{AB}$ (where Alice possesses $A$ and Bob
possesses $B$), a quantum-to-classical distortion observable $\Delta_{RBX}$, and any given distortion $D > 0$,
the quantum-to-classical rate distortion function with quantum side information, sufficient common
randomness, and such that the protocol causes only a negligible disturbance to the systems of the
reference and Bob, is given by

$$R^{q-c}_{\mathrm{qsi}}(D) = \min_{\Lambda \,:\, d(\rho, \mathcal{M}_\Lambda) \le D} I(X;R|B)_\sigma, \qquad (38)$$

where the state $\sigma$ is as defined in (37) of Theorem 5.

Proof. The proof of the achievability part of this theorem again follows directly from Theorem 12
of Ref. [22], which deals with measurement compression in the presence of quantum side information.
We merely fix the map that minimizes the expression in (38) and apply the aforementioned theorem.
The resulting protocol meets the distortion constraint because of the way that the POVM is chosen
in (38).



The converse part of this theorem exploits the approach from the converse parts of Theorems 12
and 14 of Ref. [22], which in turn exploit ideas of Cuff [7]. The most general protocol begins with the
state $(\psi^{\rho}_{RAB})^{\otimes n}$ shared between the reference, Alice, and Bob. We let Alice and Bob share common
randomness as well (embodied in some random variable $M$). Alice performs an encoding on her
systems $A^n$ with the help of her share of the common randomness $M$, producing a classical output
given by the random variable $L$, which takes values in a finite alphabet $\mathcal{C}$. Let $\sigma$ denote the state at
this point. Also, let $R$ be the rate of classical communication, i.e., $R = (\log_2 |\mathcal{C}|)/n$. Alice sends $L$
to Bob, who then combines this with his share of the common randomness to perform some decoding
map on $B^n$, producing a classical sequence $X^n$ and a quantum system $B'^n$. Let $\omega$ denote the final
state after this encoding-decoding procedure. Further, let $\mathcal{F}^{A^nB^n \to X^nB'^n}$ denote the effective CPTP
map on $\rho_{AB}^{\otimes n}$ (the state that Alice and Bob share at the start of the protocol) resulting from these
encoding and decoding operations. We demand that the distortion of the output $X^n$ be no larger
than $D$ (in a sense similar to that in (23), though in this case we need to trace over the systems
$B'^n$), and we furthermore demand that the trace distance between $(\psi^{\rho}_{RB})^{\otimes n}$ and the state $\omega_{R^nB'^n}$
on systems $R^nB'^n$ (at the end of the protocol) be no larger than some arbitrarily small $\varepsilon > 0$. The
converse then proceeds as follows. For $n$ large enough,

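$$\begin{aligned}
nR &\ge H(L)_\sigma \\
&\ge I(L; MB^nR^n)_\sigma \\
&= I(LMB^n; R^n)_\sigma + I(L; MB^n)_\sigma - I(R^n; B^nM)_\sigma \\
&\ge I(LMB^n; R^n)_\sigma - I(R^n; B^n)_\sigma \\
&\ge I(X^nB'^n; R^n)_\omega - I(R^n; B'^n)_\omega - n\varepsilon' \\
&\ge \sum_k \left[ I(X_kB'_k; R_k)_\omega - I(R_k; B'_k)_\omega \right] - 2n\varepsilon' \\
&= \sum_k I(X_k; R_k | B'_k)_\omega - 2n\varepsilon'. \qquad (39)
\end{aligned}$$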

The first inequality follows because the entropy of a system is never larger than the logarithm of
its dimension. The second inequality follows because $I(L; MB^nR^n)_\sigma = H(L)_\sigma - H(L|MB^nR^n)_\sigma$
and $H(L|MB^nR^n)_\sigma \ge 0$ for a classical $L$. The first equality is an identity for quantum mutual
information. The third inequality follows because the common randomness $M$ is in a product
state with $R^nB^n$, so that $I(R^n; B^nM)_\sigma = I(R^n; B^n)_\sigma$, and because $I(L; MB^n)_\sigma \ge 0$. The fourth
inequality follows from quantum data processing, Lemma 1 (the systems $LMB^n$ are processed to
produce systems $X^nB'^n$), and from the requirement that the protocol causes negligible disturbance
of the state of $R^nB^n$. The term $\varepsilon'$ (which is a function of $\varepsilon$) arises from an application of the
Alicki-Fannes' inequality [1], where $\lim_{\varepsilon \to 0} \varepsilon'(\varepsilon) = 0$. The fifth inequality follows from superadditivity of
quantum mutual information (Lemma 2) and because the state on $R^nB'^n$ is close in trace distance
to a tensor-product state (see Lemma 10 of Ref. [22]). The second equality follows from the identity
$I(X_kB'_k; R_k)_\omega - I(R_k; B'_k)_\omega = I(X_k; R_k | B'_k)_\omega$.
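
Both mutual-information identities used in (39) follow from the chain rule $I(A;BC) = I(A;B) + I(A;C|B)$. A short derivation is written out here for convenience, in the notation of the proof and with state subscripts suppressed:

```latex
\begin{align*}
% First equality in (39): expand both sides with the chain rule
% and eliminate the common term I(L; R^n | M B^n).
I(L; MB^nR^n) &= I(L; MB^n) + I(L; R^n | MB^n), \\
I(LMB^n; R^n) &= I(MB^n; R^n) + I(L; R^n | MB^n), \\
\Rightarrow\quad I(L; MB^nR^n) &= I(LMB^n; R^n) + I(L; MB^n) - I(R^n; B^nM). \\[4pt]
% Second equality in (39): the same chain rule applied to I(X_k B'_k; R_k).
I(X_kB'_k; R_k) &= I(B'_k; R_k) + I(X_k; R_k | B'_k), \\
\Rightarrow\quad I(X_kB'_k; R_k) - I(R_k; B'_k) &= I(X_k; R_k | B'_k).
\end{align*}
```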

At this point, we have argued that the above lower bound holds for a protocol that exploits
common randomness and classical communication to implement a map $\mathcal{F}^{A^nB^n \to X^nB'^n}$. This map
meets the distortion constraint while also causing only a negligible disturbance to the state on
$R^nB^n$, in the sense that

$$\left\| \mathrm{Tr}_{X^n}\left\{ \mathcal{F}^{A^nB^n \to X^nB'^n}\left( (\psi^{\rho}_{RAB})^{\otimes n} \right) \right\} - (\psi^{\rho}_{RB})^{\otimes n} \right\|_1 \le \varepsilon.$$



As in the proof of Theorem 14 of Ref. [22], applying Uhlmann's theorem to the above condition
guarantees that there is some map acting only on Alice's system, such that the information quantity
in the last line of the above chain of inequalities (39) does not change too much. For completeness
we repeat the argument here. Let the Kraus representation of $\mathcal{F}^{A^nB^n \to X^nB'^n}$ be given by

$$\mathcal{F}^{A^nB^n \to X^nB'^n}(\rho) = \sum_i F_i\, \rho\, F_i^\dagger.$$

A purification of $\mathrm{Tr}_{X^n}\{\mathcal{F}^{A^nB^n \to X^nB'^n}((\psi^{\rho}_{RAB})^{\otimes n})\}$ is given by

$$\sum_i F_i \left(|\psi^{\rho}_{RAB}\rangle\right)^{\otimes n} \otimes |i\rangle_I, \qquad (40)$$

where $I$ is a purifying system, while a purification of $(\psi^{\rho}_{RB})^{\otimes n}$ is $(|\psi^{\rho}_{RAB}\rangle)^{\otimes n}$. By Uhlmann's theorem,
there is an isometry $U^{A^n \to X^nI}$ acting only on Alice's system, taking $(|\psi^{\rho}_{RAB}\rangle)^{\otimes n}$ to an approximation
of the state in (40) such that the trace distance between this state and $U^{A^n \to X^nI}(|\psi^{\rho}_{RAB}\rangle)^{\otimes n}$
is at most $2\sqrt{\varepsilon}$. Thus, the map on Alice's side consists of applying $U^{A^n \to X^nI}$ and tracing out $I$.
Let $\omega'$ denote the resulting state. By exploiting this map instead of the original one, we find the
following lower bound on the information quantity in (39):

$$\sum_k I(X_k; R_k | B_k)_{\omega'} - 3n\varepsilon'.$$



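The important feature of this approximation map is that it acts only on Alice's side. Continuing,
we have

$$\begin{aligned}
\sum_k I(X_k; R_k | B_k)_{\omega'} - 3n\varepsilon'
&\ge \sum_k R^{q-c}_{\mathrm{qsi}}\!\left( d(\rho, \mathcal{G}_n^{(k)}) \right) - 3n\varepsilon' \\
&= n \sum_k \frac{1}{n}\, R^{q-c}_{\mathrm{qsi}}\!\left( d(\rho, \mathcal{G}_n^{(k)}) \right) - 3n\varepsilon' \\
&\ge n\, R^{q-c}_{\mathrm{qsi}}\!\left( \sum_k \frac{1}{n}\, d(\rho, \mathcal{G}_n^{(k)}) \right) - 3n\varepsilon' \\
&\ge n\, R^{q-c}_{\mathrm{qsi}}(D) - 3n\varepsilon'.
\end{aligned}$$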



In the above, $\mathcal{G}_n^{(k)}$ is the marginal operation on the $k$-th copy of the source space induced by the
overall encoding and approximation of the decoding guaranteed by Uhlmann's theorem. The first
inequality follows from the fact that the map $\mathcal{G}_n^{(k)}$ has distortion $d(\rho, \mathcal{G}_n^{(k)})$ and the expression (38)
for the rate-distortion function $R^{q-c}_{\mathrm{qsi}}$ in Theorem 6 involves a minimum over all maps on Alice's
system with this distortion. The first equality is obvious. The last two inequalities follow because
the rate-distortion function is convex and non-increasing as a function of $D$ (the proof of convexity
is similar to the proof of Lemma 14 of Ref. [8], though here we rely on the map acting solely on
Alice's system). ■
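
To see how much the quantum side information can help, one can compare the rate in (38) with the rate $I(X;RB)$ of a protocol that ignores $B$. The following sketch is our own remark; it uses only the chain rule and the non-negativity of mutual information, with both quantities evaluated on the same state $\sigma$ (that is, for a fixed POVM):

```latex
\begin{align*}
I(X;RB)_\sigma &= I(X;B)_\sigma + I(X;R|B)_\sigma \\
\Rightarrow\quad I(X;RB)_\sigma - I(X;R|B)_\sigma &= I(X;B)_\sigma \;\ge\; 0 .
\end{align*}
```

For a fixed POVM, the saving in classical communication is therefore the correlation $I(X;B)_\sigma$ between the measurement outcome and Bob's system. The minimizing POVMs of the two problems need not coincide, so this is a comparison per measurement rather than a statement about the two optimal rates.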

A special case of the above theorem is the setting considered in Theorem 4.2 of Ref. [13]. There,
Luo and Devetak considered the scenario in which the source is a classical-quantum state of the
form:

$$\sum_y p_Y(y)\, |y\rangle\langle y|_Y \otimes \rho^y_B,$$

where Alice possesses $Y$ and Bob $B$. The goal is for Alice to transmit her classical data to Bob
up to some distortion, and Bob is allowed to use the quantum side information to help reduce the
communication costs. They proved that the following rate is achievable:

$$\min_{p_{X|Y}(x|y) \,:\, \mathbb{E}\{d(x,y)\} \le D} I(X;Y|B),$$

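for some classical distortion measure $d(x, y)$ and where the information quantity is with respect to
a state of the following form:

$$\sum_{x,y} p_{X|Y}(x|y)\, p_Y(y)\, |x\rangle\langle x|_X \otimes |y\rangle\langle y|_Y \otimes \rho^y_B.$$
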
Luo and Devetak were not able to find a single-letter characterization of the rate-distortion function,
but with our additional assumptions of sufficient common randomness and a negligible disturbance
of the quantum side information, our theorem reduces to a single-letter characterization for their
setting. In fact, if one chooses the distortion observable in (34) so that the operators $\Delta^x_{RB}$ are
diagonal in the Schmidt basis of the $RB$ systems of $\psi^{\rho}_{RAB}$, then a statement similar to Lemma 4
applies. That is, in this case, it is optimal to measure the $A$ system in the eigenbasis of $\rho_A$ and
proceed according to the protocol of Luo and Devetak in Ref. [13]. As stated above, their protocol
is optimal if we demand that it cause only a negligible disturbance to the state of the reference and
Bob.



6 Conclusions and discussions 

We have derived a single-letter formula for the quantum-to-classical rate distortion function. The
goal in quantum-to-classical rate-distortion coding is to provide a compressed classical approximation
of a quantum source, up to some specified level of distortion, as determined by a distortion
observable. The formula is expressed as a minimization of a Holevo quantity over all quantum-to-classical
channels that meet the distortion constraint. In general, our results show that a collective
measurement of the quantum source is required to obtain optimal compression rates. However, if
the distortion observable has a classical form (so that each operator $\Delta_x$ is diagonal in the Schmidt
basis), then the best strategy for quantum-to-classical rate-distortion coding ends up being an effectively
classical strategy, in which Alice performs individual measurements of each copy of the
source in its eigenbasis, and processes the resulting classical data according to Shannon's classical
rate-distortion protocol.

We have also derived a single-letter formula for the quantum-to-classical rate distortion function 
when the receiver has some quantum side information about the source. Our assumptions are that 
Alice and Bob share sufficient common randomness, and that the protocol causes only a negligible 
disturbance to the joint state of the reference and the quantum side information. We consider this 
latter assumption to be rather natural, since Bob might wish to make use of his quantum side 
information in some future protocol. Our results suggest that it might generally be possible for 
quantum information-theoretic protocols that employ quantum side information to be simplified 
by employing this assumption, due to the restriction that it imposes on the quantum states at the 
output of a given protocol. This assumption is purely non-classical, since it is always possible to
copy classical information before processing it in any way.

There are some interesting open questions to consider going forward from here. It would be
ideal if we could derandomize the common randomness in the protocol that uses quantum side
information, since it would imply that this extra resource is unnecessary. However, if we did so,
the protocol for measurement compression with quantum side information could end up causing a
non-negligible disturbance to the joint state of the reference and Bob's systems, for some of the
values of the common randomness. Since our approach in the proof of the achievability part of
the coding theorem relies on this protocol, we have not been able to conclude that the common
randomness is unnecessary. However, the common randomness plays only a passive role in the
converse theorem, and this suggests that it might ultimately be unnecessary. In order to determine
if this is the case, one would have to consider a different protocol in proving the achievability part
of the coding theorem.